SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: Rasmussen, Grethe 

Mikkelsen, Jan Moller 
Schulein, Martin 
Patkar, Shankant A. 
Hagen, Fred 

(ii) TITLE OF INVENTION: A Cellulase Preparation Comprising an 
Endoglucanase Enzyme 

(iii) NUMBER OF SEQUENCES: 33 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Novo Nordisk of North America, Inc. 

(B) STREET: 405 Lexington Avenue, 64th Floor

(C) CITY: New York 

(D) STATE: New York 

(E) COUNTRY: United States of America 

(F) ZIP: 10174-6401

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/389,423

(B) FILING DATE: 14-FEB-1995 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Lambiris, Elias J. 

(B) REGISTRATION NUMBER: 33,728 

(C) REFERENCE/DOCKET NUMBER: 3469.214-US

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 212-867-0123 

(B) TELEFAX: 212-878-9655 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1060 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Humicola insolens 

(B) STRAIN: DSM 1800 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 73..924

(ix) FEATURE:

(A) NAME/KEY: sig_peptide

(B) LOCATION: 10..72

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 10..924

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GGATCCAAG ATG CGT TCC TCC CCC CTC CTC CCG TCC GCC GTT GTG GCC 48
Met Arg Ser Ser Pro Leu Leu Pro Ser Ala Val Val Ala
-21 -20 -15 -10

GCC CTG CCG GTG TTG GCC CTT GCC GCT GAT GGC AGG TCC ACC CGC TAC 96
Ala Leu Pro Val Leu Ala Leu Ala Ala Asp Gly Arg Ser Thr Arg Tyr
-5 1 5

TGG GAC TGC TGC AAG CCT TCG TGC GGC TGG GCC AAG AAG GCT CCC GTG 144
Trp Asp Cys Cys Lys Pro Ser Cys Gly Trp Ala Lys Lys Ala Pro Val
10 15 20

AAC CAG CCT GTC TTT TCC TGC AAC GCC AAC TTC CAG CGT ATC ACG GAC 192
Asn Gln Pro Val Phe Ser Cys Asn Ala Asn Phe Gln Arg Ile Thr Asp
25 30 35 40

TTC GAC GCC AAG TCC GGC TGC GAG CCG GGC GGT GTC GCC TAC TCG TGC 240
Phe Asp Ala Lys Ser Gly Cys Glu Pro Gly Gly Val Ala Tyr Ser Cys
45 50 55

GCC GAC CAG ACC CCA TGG GCT GTG AAC GAC GAC TTC GCG CTC GGT TTT 288
Ala Asp Gln Thr Pro Trp Ala Val Asn Asp Asp Phe Ala Leu Gly Phe
60 65 70

GCT GCC ACC TCT ATT GCC GGC AGC AAT GAG GCG GGC TGG TGC TGC GCC 336
Ala Ala Thr Ser Ile Ala Gly Ser Asn Glu Ala Gly Trp Cys Cys Ala
75 80 85

TGC TAC GAG CTC ACC TTC ACA TCC GGT CCT GTT GCT GGC AAG AAG ATG 384
Cys Tyr Glu Leu Thr Phe Thr Ser Gly Pro Val Ala Gly Lys Lys Met
90 95 100

GTC GTC CAG TCC ACC AGC ACT GGC GGT GAT CTT GGC AGC AAC CAC TTC 432
Val Val Gln Ser Thr Ser Thr Gly Gly Asp Leu Gly Ser Asn His Phe
105 110 115 120

GAT CTC AAC ATC CCC GGC GGC GGC GTC GGC ATC TTC GAC GGA TGC ACT 480
Asp Leu Asn Ile Pro Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr
125 130 135

CCC CAG TTC GGC GGT CTG CCC GGC CAG CGC TAC GGC GGC ATC TCG TCC 528
Pro Gln Phe Gly Gly Leu Pro Gly Gln Arg Tyr Gly Gly Ile Ser Ser
140 145 150

CGC AAC GAG TGC GAT CGG TTC CCC GAC GCC CTC AAG CCC GGC TGC TAC 576
Arg Asn Glu Cys Asp Arg Phe Pro Asp Ala Leu Lys Pro Gly Cys Tyr
155 160 165

TGG CGC TTC GAC TGG TTC AAG AAC GCC GAC AAT CCG AGC TTC AGC TTC 624
Trp Arg Phe Asp Trp Phe Lys Asn Ala Asp Asn Pro Ser Phe Ser Phe
170 175 180

CGT CAG GTC CAG TGC CCA GCC GAG CTC GTC GCT CGC ACC GGA TGC CGC 672
Arg Gln Val Gln Cys Pro Ala Glu Leu Val Ala Arg Thr Gly Cys Arg
185 190 195 200

CGC AAC GAC GAC GGC AAC TTC CCT GCC GTC CAG ATC CCC TCC AGC AGC 720
Arg Asn Asp Asp Gly Asn Phe Pro Ala Val Gln Ile Pro Ser Ser Ser
205 210 215

ACC AGC TCT CCG GTC AAC CAG CCT ACC AGC ACC AGC ACC ACG TCC ACC 768
Thr Ser Ser Pro Val Asn Gln Pro Thr Ser Thr Ser Thr Thr Ser Thr
220 225 230

TCC ACC ACC TCG AGC CCG CCA GTC CAG CCT ACG ACT CCC AGC GGC TGC 816
Ser Thr Thr Ser Ser Pro Pro Val Gln Pro Thr Thr Pro Ser Gly Cys
235 240 245

ACT GCT GAG AGG TGG GCT CAG TGC GGC GGC AAT GGC TGG AGC GGC TGC 864
Thr Ala Glu Arg Trp Ala Gln Cys Gly Gly Asn Gly Trp Ser Gly Cys
250 255 260

ACC ACC TGC GTC GCT GGC AGC ACT TGC ACG AAG ATT AAT GAC TGG TAC 912
Thr Thr Cys Val Ala Gly Ser Thr Cys Thr Lys Ile Asn Asp Trp Tyr
265 270 275 280

CAT CAG TGC CTG TAGACGCAGG GCAGCTTGAG GGCCTTACTG GTGGCCGCAA 964
His Gln Cys Leu



CGAAATGACA CTCCCAATCA CTGTATTAGT TCTTGTACAT AATTTCGTCA TCCCTCCAGG 1024 
GATTGTCACA TAAATGCAAT GAGGAACAAT GAGTAC 1060 
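
The feature coordinates given for SEQ ID NO:1 can be cross-checked with simple arithmetic. The Python sketch below is an illustration only and is not part of the filed listing; the helper name `codon_count` is hypothetical, and it assumes that LOCATION ranges are 1-based and inclusive, as in the FEATURE entries above.

```python
def codon_count(start: int, end: int) -> int:
    """Number of whole codons in a 1-based, inclusive nucleotide range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"range {start}..{end} is not a multiple of 3")
    return length // 3

# Coordinates as stated in the FEATURE entries of SEQ ID NO:1.
signal = codon_count(10, 72)    # sig_peptide
mature = codon_count(73, 924)   # mat_peptide
cds = codon_count(10, 924)      # CDS

# Residues -21..-1 form the signal peptide, residues 1..284 the mature
# peptide; together they give the 305 amino acids stated for SEQ ID NO:2.
print(signal, mature, cds)
```

Under these assumptions the signal peptide and mature peptide lengths sum to the CDS length, which agrees with the LENGTH entry of SEQ ID NO:2.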



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 305 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Arg Ser Ser Pro Leu Leu Pro Ser Ala Val Val Ala Ala Leu Pro
-21 -20 -15 -10

Val Leu Ala Leu Ala Ala Asp Gly Arg Ser Thr Arg Tyr Trp Asp Cys
-5 1 5 10

Cys Lys Pro Ser Cys Gly Trp Ala Lys Lys Ala Pro Val Asn Gln Pro
15 20 25

Val Phe Ser Cys Asn Ala Asn Phe Gln Arg Ile Thr Asp Phe Asp Ala
30 35 40

Lys Ser Gly Cys Glu Pro Gly Gly Val Ala Tyr Ser Cys Ala Asp Gln
45 50 55

Thr Pro Trp Ala Val Asn Asp Asp Phe Ala Leu Gly Phe Ala Ala Thr
60 65 70 75

Ser Ile Ala Gly Ser Asn Glu Ala Gly Trp Cys Cys Ala Cys Tyr Glu
80 85 90

Leu Thr Phe Thr Ser Gly Pro Val Ala Gly Lys Lys Met Val Val Gln
95 100 105

Ser Thr Ser Thr Gly Gly Asp Leu Gly Ser Asn His Phe Asp Leu Asn
110 115 120

Ile Pro Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr Pro Gln Phe
125 130 135

Gly Gly Leu Pro Gly Gln Arg Tyr Gly Gly Ile Ser Ser Arg Asn Glu
140 145 150 155

Cys Asp Arg Phe Pro Asp Ala Leu Lys Pro Gly Cys Tyr Trp Arg Phe
160 165 170

Asp Trp Phe Lys Asn Ala Asp Asn Pro Ser Phe Ser Phe Arg Gln Val
175 180 185

Gln Cys Pro Ala Glu Leu Val Ala Arg Thr Gly Cys Arg Arg Asn Asp
190 195 200

Asp Gly Asn Phe Pro Ala Val Gln Ile Pro Ser Ser Ser Thr Ser Ser
205 210 215

Pro Val Asn Gln Pro Thr Ser Thr Ser Thr Thr Ser Thr Ser Thr Thr
220 225 230 235

Ser Ser Pro Pro Val Gln Pro Thr Thr Pro Ser Gly Cys Thr Ala Glu
240 245 250

Arg Trp Ala Gln Cys Gly Gly Asn Gly Trp Ser Gly Cys Thr Thr Cys
255 260 265

Val Ala Gly Ser Thr Cys Thr Lys Ile Asn Asp Trp Tyr His Gln Cys
270 275 280

Leu



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1473 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Fusarium oxysporum

(B) STRAIN: DSM 2672

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 97..1224

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
GAATTCGCGG CCGCTCATTC ACTTCATTCA TTCTTTAGAA TTACATACAC TCTCTTTCAA 60
AACAGTCACT CTTTAAACAA AACAACTTTT GCAACA ATG CGA TCT TAC ACT CTT 114
Met Arg Ser Tyr Thr Leu
1 5

CTC GCC CTG GCC GGC CCT CTC GCC GTG AGT GCT GCT TCT GGA AGC GGT 162
Leu Ala Leu Ala Gly Pro Leu Ala Val Ser Ala Ala Ser Gly Ser Gly
10 15 20

CAC TCT ACT CGA TAC TGG GAT TGC TGC AAG CCT TCT TGC TCT TGG AGC 210
His Ser Thr Arg Tyr Trp Asp Cys Cys Lys Pro Ser Cys Ser Trp Ser
25 30 35

GGA AAG GCT GCT GTC AAC GCC CCT GCT TTA ACT TGT GAT AAG AAC GAC 258
Gly Lys Ala Ala Val Asn Ala Pro Ala Leu Thr Cys Asp Lys Asn Asp
40 45 50

AAC CCC ATT TCC AAC ACC AAT GCT GTC AAC GGT TGT GAG GGT GGT GGT 306
Asn Pro Ile Ser Asn Thr Asn Ala Val Asn Gly Cys Glu Gly Gly Gly
55 60 65 70

TCT GCT TAT GCT TGC ACC AAC TAC TCT CCC TGG GCT GTC AAC GAT GAG 354
Ser Ala Tyr Ala Cys Thr Asn Tyr Ser Pro Trp Ala Val Asn Asp Glu
75 80 85

CTT GCC TAC GGT TTC GCT GCT ACC AAG ATC TCC GGT GGC TCC GAG GCC 402
Leu Ala Tyr Gly Phe Ala Ala Thr Lys Ile Ser Gly Gly Ser Glu Ala
90 95 100

AGC TGG TGC TGT GCT TGC TAT GCT TTG ACC TTC ACC ACT GGC CCC GTC 450
Ser Trp Cys Cys Ala Cys Tyr Ala Leu Thr Phe Thr Thr Gly Pro Val
105 110 115

AAG GGC AAG AAG ATG ATC GTC CAG TCC ACC AAC ACT GGA GGT GAT CTC 498
Lys Gly Lys Lys Met Ile Val Gln Ser Thr Asn Thr Gly Gly Asp Leu
120 125 130

GGC GAC AAC CAC TTC GAT CTC ATG ATG CCC GGC GGT GGT GTC GGT ATC 546
Gly Asp Asn His Phe Asp Leu Met Met Pro Gly Gly Gly Val Gly Ile
135 140 145 150

TTC GAC GGC TGC ACC TCT GAG TTC GGC AAG GCT CTC GGC GGT GCC CAG 594
Phe Asp Gly Cys Thr Ser Glu Phe Gly Lys Ala Leu Gly Gly Ala Gln
155 160 165

TAC GGC GGT ATC TCC TCC CGA AGC GAA TGT GAT AGC TAC CCC GAG CTT 642
Tyr Gly Gly Ile Ser Ser Arg Ser Glu Cys Asp Ser Tyr Pro Glu Leu
170 175 180

CTC AAG GAC GGT TGC CAC TGG CGA TTC GAC TGG TTC GAG AAC GCC GAC 690
Leu Lys Asp Gly Cys His Trp Arg Phe Asp Trp Phe Glu Asn Ala Asp
185 190 195

AAC CCT GAC TTC ACC TTT GAG CAG GTT CAG TGC CCC AAG GCT CTC CTC 738
Asn Pro Asp Phe Thr Phe Glu Gln Val Gln Cys Pro Lys Ala Leu Leu
200 205 210

GAC ATC AGT GGA TGC AAG CGT GAT GAC GAC TCC AGC TTC CCT GCC TTC 786
Asp Ile Ser Gly Cys Lys Arg Asp Asp Asp Ser Ser Phe Pro Ala Phe
215 220 225 230

AAG GTT GAT ACC TCG GCC AGC AAG CCC CAG CCC TCC AGC TCC GCT AAG 834
Lys Val Asp Thr Ser Ala Ser Lys Pro Gln Pro Ser Ser Ser Ala Lys
235 240 245

AAG ACC ACC TCC GCT GCT GCT GCC GCT CAG CCC CAG AAG ACC AAG GAT 882
Lys Thr Thr Ser Ala Ala Ala Ala Ala Gln Pro Gln Lys Thr Lys Asp
250 255 260

TCC GCT CCT GTT GTC CAG AAG TCC TCC ACC AAG CCT GCC GCT CAG CCC 930
Ser Ala Pro Val Val Gln Lys Ser Ser Thr Lys Pro Ala Ala Gln Pro
265 270 275

GAG CCT ACT AAG CCC GCC GAC AAG CCC CAG ACC GAC AAG CCT GTC GCC 978
Glu Pro Thr Lys Pro Ala Asp Lys Pro Gln Thr Asp Lys Pro Val Ala
280 285 290

ACC AAG CCT GCT GCT ACC AAG CCC GTC CAA CCT GTC AAC AAG CCC AAG 1026
Thr Lys Pro Ala Ala Thr Lys Pro Val Gln Pro Val Asn Lys Pro Lys
295 300 305 310

ACA ACC CAG AAG GTC CGT GGA ACC AAA ACC CGA GGA AGC TGC CCG GCC 1074
Thr Thr Gln Lys Val Arg Gly Thr Lys Thr Arg Gly Ser Cys Pro Ala
315 320 325

AAG ACT GAC GCT ACC GCC AAG GCC TCC GTT GTC CCT GCT TAT TAC CAG 1122
Lys Thr Asp Ala Thr Ala Lys Ala Ser Val Val Pro Ala Tyr Tyr Gln
330 335 340

TGT GGT GGT TCC AAG TCC GCT TAT CCC AAC GGC AAC CTC GCT TGC GCT 1170
Cys Gly Gly Ser Lys Ser Ala Tyr Pro Asn Gly Asn Leu Ala Cys Ala
345 350 355

ACT GGA AGC AAG TGT GTC AAG CAG AAC GAG TAC TAC TCC CAG TGT GTC 1218
Thr Gly Ser Lys Cys Val Lys Gln Asn Glu Tyr Tyr Ser Gln Cys Val
360 365 370

CCC AAC TAAATGGTAG ATCCATCGGT TGTGGAAGAG ACTATGCGTC TCAGAAGGGA 1274
Pro Asn
375

TCCTCTCATG AGCAGGCTTG TCATTGTATA GCATGGCATC CTGGACCAAG TGTTCGACCC 1334
TTGTTGTACA TAGTATATCT TCATTGTATA TATTTAGACA CATAGATAGC CTCTTGTCAG 1394
CGACAACTGG CTACAAAAGA CTTGGCAGGC TTGTTCAATA TTGACACAGT TTCCTCCATA 1454
AAAAAAAAAA AAAAAAAAA 1473
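
A codon row of a listing like this one can be checked against its amino acid row by translating it with the standard genetic code. The Python sketch below is illustrative only and not part of the filed listing; the function name `translate` and the compact table layout are choices made for this example. A stop codon (TAA, TAG or TGA) appearing inside the reading frame, where the amino acid row shows a residue, would point to a transcription error in the nucleotide row.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

# First coding row of SEQ ID NO:3 (nucleotides 97..114).
first_row = translate("ATG CGA TCT TAC ACT CTT")
print(first_row)  # one-letter codes for Met Arg Ser Tyr Thr Leu
```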

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 376 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Arg Ser Tyr Thr Leu Leu Ala Leu Ala Gly Pro Leu Ala Val Ser
1 5 10 15

Ala Ala Ser Gly Ser Gly His Ser Thr Arg Tyr Trp Asp Cys Cys Lys
20 25 30

Pro Ser Cys Ser Trp Ser Gly Lys Ala Ala Val Asn Ala Pro Ala Leu
35 40 45

Thr Cys Asp Lys Asn Asp Asn Pro Ile Ser Asn Thr Asn Ala Val Asn
50 55 60

Gly Cys Glu Gly Gly Gly Ser Ala Tyr Ala Cys Thr Asn Tyr Ser Pro
65 70 75 80

Trp Ala Val Asn Asp Glu Leu Ala Tyr Gly Phe Ala Ala Thr Lys Ile
85 90 95

Ser Gly Gly Ser Glu Ala Ser Trp Cys Cys Ala Cys Tyr Ala Leu Thr
100 105 110

Phe Thr Thr Gly Pro Val Lys Gly Lys Lys Met Ile Val Gln Ser Thr
115 120 125

Asn Thr Gly Gly Asp Leu Gly Asp Asn His Phe Asp Leu Met Met Pro
130 135 140

Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr Ser Glu Phe Gly Lys
145 150 155 160

Ala Leu Gly Gly Ala Gln Tyr Gly Gly Ile Ser Ser Arg Ser Glu Cys
165 170 175

Asp Ser Tyr Pro Glu Leu Leu Lys Asp Gly Cys His Trp Arg Phe Asp
180 185 190

Trp Phe Glu Asn Ala Asp Asn Pro Asp Phe Thr Phe Glu Gln Val Gln
195 200 205

Cys Pro Lys Ala Leu Leu Asp Ile Ser Gly Cys Lys Arg Asp Asp Asp
210 215 220

Ser Ser Phe Pro Ala Phe Lys Val Asp Thr Ser Ala Ser Lys Pro Gln
225 230 235 240

Pro Ser Ser Ser Ala Lys Lys Thr Thr Ser Ala Ala Ala Ala Ala Gln
245 250 255

Pro Gln Lys Thr Lys Asp Ser Ala Pro Val Val Gln Lys Ser Ser Thr
260 265 270

Lys Pro Ala Ala Gln Pro Glu Pro Thr Lys Pro Ala Asp Lys Pro Gln
275 280 285

Thr Asp Lys Pro Val Ala Thr Lys Pro Ala Ala Thr Lys Pro Val Gln
290 295 300

Pro Val Asn Lys Pro Lys Thr Thr Gln Lys Val Arg Gly Thr Lys Thr
305 310 315 320

Arg Gly Ser Cys Pro Ala Lys Thr Asp Ala Thr Ala Lys Ala Ser Val
325 330 335

Val Pro Ala Tyr Tyr Gln Cys Gly Gly Ser Lys Ser Ala Tyr Pro Asn
340 345 350

Gly Asn Leu Ala Cys Ala Thr Gly Ser Lys Cys Val Lys Gln Asn Glu
355 360 365

Tyr Tyr Ser Gln Cys Val Pro Asn
370 375



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5 
AGCTGCGGCC GCAGGCCGCG GAGGCCA 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6
AGCTTGGCCT CCGCGGCCTG CGGCCGC 



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7 
AATTCGCGGC CGCGGCCATG GAGGCC 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
AATTGGCCTC CATGGCCGCG GCCGCG



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
AAYGCYGACA AAYCC 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10 
AACGAYGAYG GNAAYTTCCC 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11 
AAYGAYTGGT ACCAYCARTG 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12 
GCGCCAGTAG CAGCCGGGCT TGAGGG 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13 
ACGTCTCAAC TCGGATCCAA GATGCGTT 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14 
CTCAACTCTG ATCAAGATGC GTTCC 



(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15 
TGTCGACCAG TAAGGCCCTC AAGCTG 



(2) INFORMATION FOR SEQ ID NO: 16: 
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: RNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16 
GACAGAGCAC AGAATTCACT AGTGAGCTCT 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17 
TGGGAYTGYT GYAARCC 
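
Several of the primers in this listing use IUPAC ambiguity codes (Y, R, K, N), so each entry stands for a pool of concrete oligonucleotides. The Python sketch below is an illustration, not part of the filed listing: it counts and enumerates the sequences represented by SEQ ID NO: 17. The function names `degeneracy` and `expand` are hypothetical; the code table follows the standard IUPAC nucleotide nomenclature.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.replace(" ", "").upper():
        count *= len(IUPAC[base])
    return count

def expand(primer: str) -> list[str]:
    """Enumerate every concrete sequence covered by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.replace(" ", "").upper())
    return ["".join(combo) for combo in product(*choices)]

seq17 = "TGGGAYTGYTGYAARCC"   # SEQ ID NO: 17 as listed above
pool = expand(seq17)
print(degeneracy(seq17), len(pool))
```

Enumeration is practical only for low-degeneracy primers; for entries containing several N positions, such as SEQ ID NO: 19, counting with `degeneracy` alone is the cheaper check.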



(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
AGGGAGACCG GAATTCTGGG AYTGYTGYAA RCC 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

CCNGGNGGNG GNGTNGG 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20 
AGGGAGACCG GAATTCCCNG GNGGNGGNGT NGG 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21 
ACNAYCATNK TYTTNCC 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22 
GACAGAGCAC AGAATTCACN AYCATNKTYT TNCC 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
NGGRTTRTCN GCNKYYTYRA ACCA



(2) INFORMATION FOR SEQ ID NO:24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 
GACAGAGCAC AGAATTCNGG RTTRTCNGCN KYYTYRAACC A 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 
GGGGTAGCTA TCACATTCGC TTCGGGAGGA GATACCGCCG TA 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
CTTCTTGCTC TTGGAGCGGA AAGGCTGCTG TCAACGCCCC TG 



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27 
TGTACGCATG TAACATTA 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28 
CTGCACAATA TTTCAAGC 



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
GGGGTAGCTA TCACATTCGC TTCGGGAGGA GATACCGCCG 



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
CTTCTTGCTC TTGGAGCGGA AAGGCTGCTG TCAACGCCCC 



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31
AGCTTCTCAA GGACGGTT 



(2) INFORMATION FOR SEQ ID NO: 32: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32 
AACAAGGGTC GAACACTT 



(2) INFORMATION FOR SEQ ID NO:33: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33 

CCAGAAGACC AAGGATT 
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: NOVO NORDISK A/S
(ii) TITLE OF INVENTION: A Cellulase Preparation
(iii) NUMBER OF SEQUENCES: 4
(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: NOVO NORDISK A/S, Patent Department

(B) STREET: Novo Alle
(C) CITY: Bagsvaerd

(E) COUNTRY: DENMARK

(F) ZIP: DK-2880

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Thalsoe-Madsen, Birgit

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: +45 4444 8888
(B) TELEFAX: +45 4449 3256

(C) TELEX: 37304



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1060 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Humicola insolens

(B) STRAIN: DSM 1800

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 73..927

(ix) FEATURE:

(A) NAME/KEY: sig_peptide

(B) LOCATION: 10..72

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 10..927

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GGATCCAAG ATG CGT TCC TCC CCC CTC CTC CCG TCC GCC GTT GTG GCC 48
Met Arg Ser Ser Pro Leu Leu Pro Ser Ala Val Val Ala
-21 -20 -15 -10

GCC CTG CCG GTG TTG GCC CTT GCC GCT GAT GGC AGG TCC ACC CGC TAC 96
Ala Leu Pro Val Leu Ala Leu Ala Ala Asp Gly Arg Ser Thr Arg Tyr
-5 1 5

TGG GAC TGC TGC AAG CCT TCG TGC GGC TGG GCC AAG AAG GCT CCC GTG 144
Trp Asp Cys Cys Lys Pro Ser Cys Gly Trp Ala Lys Lys Ala Pro Val
10 15 20

AAC CAG CCT GTC TTT TCC TGC AAC GCC AAC TTC CAG CGT ATC ACG GAC 192
Asn Gln Pro Val Phe Ser Cys Asn Ala Asn Phe Gln Arg Ile Thr Asp
25 30 35 40

TTC GAC GCC AAG TCC GGC TGC GAG CCG GGC GGT GTC GCC TAC TCG TGC 240
Phe Asp Ala Lys Ser Gly Cys Glu Pro Gly Gly Val Ala Tyr Ser Cys
45 50 55

GCC GAC CAG ACC CCA TGG GCT GTG AAC GAC GAC TTC GCG CTC GGT TTT 288
Ala Asp Gln Thr Pro Trp Ala Val Asn Asp Asp Phe Ala Leu Gly Phe
60 65 70

GCT GCC ACC TCT ATT GCC GGC AGC AAT GAG GCG GGC TGG TGC TGC GCC 336
Ala Ala Thr Ser Ile Ala Gly Ser Asn Glu Ala Gly Trp Cys Cys Ala
75 80 85

TGC TAC GAG CTC ACC TTC ACA TCC GGT CCT GTT GCT GGC AAG AAG ATG 384
Cys Tyr Glu Leu Thr Phe Thr Ser Gly Pro Val Ala Gly Lys Lys Met
90 95 100

GTC GTC CAG TCC ACC AGC ACT GGC GGT GAT CTT GGC AGC AAC CAC TTC 432
Val Val Gln Ser Thr Ser Thr Gly Gly Asp Leu Gly Ser Asn His Phe
105 110 115 120

GAT CTC AAC ATC CCC GGC GGC GGC GTC GGC ATC TTC GAC GGA TGC ACT 480
Asp Leu Asn Ile Pro Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr
125 130 135

CCC CAG TTC GGC GGT CTG CCC GGC CAG CGC TAC GGC GGC ATC TCG TCC 528
Pro Gln Phe Gly Gly Leu Pro Gly Gln Arg Tyr Gly Gly Ile Ser Ser
140 145 150

CGC AAC GAG TGC GAT CGG TTC CCC GAC GCC CTC AAG CCC GGC TGC TAC 576
Arg Asn Glu Cys Asp Arg Phe Pro Asp Ala Leu Lys Pro Gly Cys Tyr
155 160 165

TGG CGC TTC GAC TGG TTC AAG AAC GCC GAC AAT CCG AGC TTC AGC TTC 624
Trp Arg Phe Asp Trp Phe Lys Asn Ala Asp Asn Pro Ser Phe Ser Phe
170 175 180

CGT CAG GTC CAG TGC CCA GCC GAG CTC GTC GCT CGC ACC GGA TGC CGC 672
Arg Gln Val Gln Cys Pro Ala Glu Leu Val Ala Arg Thr Gly Cys Arg
185 190 195 200

CGC AAC GAC GAC GGC AAC TTC CCT GCC GTC CAG ATC CCC TCC AGC AGC 720
Arg Asn Asp Asp Gly Asn Phe Pro Ala Val Gln Ile Pro Ser Ser Ser
205 210 215

ACC AGC TCT CCG GTC AAC CAG CCT ACC AGC ACC AGC ACC ACG TCC ACC 768
Thr Ser Ser Pro Val Asn Gln Pro Thr Ser Thr Ser Thr Thr Ser Thr
220 225 230

TCC ACC ACC TCG AGC CCG CCA GTC CAG CCT ACG ACT CCC AGC GGC TGC 816
Ser Thr Thr Ser Ser Pro Pro Val Gln Pro Thr Thr Pro Ser Gly Cys
235 240 245

ACT GCT GAG AGG TGG GCT CAG TGC GGC GGC AAT GGC TGG AGC GGC TGC 864
Thr Ala Glu Arg Trp Ala Gln Cys Gly Gly Asn Gly Trp Ser Gly Cys
250 255 260

ACC ACC TGC GTC GCT GGC AGC ACT TGC ACG AAG ATT AAT GAC TGG TAC 912
Thr Thr Cys Val Ala Gly Ser Thr Cys Thr Lys Ile Asn Asp Trp Tyr
265 270 275 280

CAT CAG TGC CTG TAGACGCAGG GCAGCTTGAG GGCCTTACTG GTGGCCGCAA 964
His Gln Cys Leu
285

CGAAATGACA CTCCCAATCA CTGTATTAGT TCTTGTACAT AATTTCGTCA TCCCTCCAGG 1024
GATTGTCACA TAAATGCAAT GAGGAACAAT GAGTAC 1060

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 305 amino acids
(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Arg Ser Ser Pro Leu Leu Pro Ser Ala Val Val Ala Ala Leu Pro
-21 -20 -15 -10

Val Leu Ala Leu Ala Ala Asp Gly Arg Ser Thr Arg Tyr Trp Asp Cys
-5 1 5 10

Cys Lys Pro Ser Cys Gly Trp Ala Lys Lys Ala Pro Val Asn Gln Pro
15 20 25

Val Phe Ser Cys Asn Ala Asn Phe Gln Arg Ile Thr Asp Phe Asp Ala
30 35 40

Lys Ser Gly Cys Glu Pro Gly Gly Val Ala Tyr Ser Cys Ala Asp Gln
45 50 55

Thr Pro Trp Ala Val Asn Asp Asp Phe Ala Leu Gly Phe Ala Ala Thr
60 65 70 75

Ser Ile Ala Gly Ser Asn Glu Ala Gly Trp Cys Cys Ala Cys Tyr Glu
80 85 90

Leu Thr Phe Thr Ser Gly Pro Val Ala Gly Lys Lys Met Val Val Gln
95 100 105

Ser Thr Ser Thr Gly Gly Asp Leu Gly Ser Asn His Phe Asp Leu Asn
110 115 120

Ile Pro Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr Pro Gln Phe
125 130 135

Gly Gly Leu Pro Gly Gln Arg Tyr Gly Gly Ile Ser Ser Arg Asn Glu
140 145 150 155

Cys Asp Arg Phe Pro Asp Ala Leu Lys Pro Gly Cys Tyr Trp Arg Phe
160 165 170

Asp Trp Phe Lys Asn Ala Asp Asn Pro Ser Phe Ser Phe Arg Gln Val
175 180 185

Gln Cys Pro Ala Glu Leu Val Ala Arg Thr Gly Cys Arg Arg Asn Asp
190 195 200

Asp Gly Asn Phe Pro Ala Val Gln Ile Pro Ser Ser Ser Thr Ser Ser
205 210 215

Pro Val Asn Gln Pro Thr Ser Thr Ser Thr Thr Ser Thr Ser Thr Thr
220 225 230 235

Ser Ser Pro Pro Val Gln Pro Thr Thr Pro Ser Gly Cys Thr Ala Glu
240 245 250

Arg Trp Ala Gln Cys Gly Gly Asn Gly Trp Ser Gly Cys Thr Thr Cys
255 260 265

Val Ala Gly Ser Thr Cys Thr Lys Ile Asn Asp Trp Tyr His Gln Cys
270 275 280

Leu



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1473 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Fusarium oxysporum

(B) STRAIN: DSM 2672

(ix) FEATURE:
(A) NAME/KEY: CDS

(B) LOCATION: 97..1224

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

GAATTCGCGG CCGCTCATTC ACTTCATTCA TTCTTTAGAA TTACATACAC TCTCTTTCAA 60
AACAGTCACT CTTTAAACAA AACAACTTTT GCAACA ATG CGA TCT TAC ACT CTT 114
Met Arg Ser Tyr Thr Leu
1 5

CTC GCC CTG GCC GGC CCT CTC GCC GTG AGT GCT GCT TCT GGA AGC GGT 162
Leu Ala Leu Ala Gly Pro Leu Ala Val Ser Ala Ala Ser Gly Ser Gly
10 15 20

CAC TCT ACT CGA TAC TGG GAT TGC TGC AAG CCT TCT TGC TCT TGG AGC 210
His Ser Thr Arg Tyr Trp Asp Cys Cys Lys Pro Ser Cys Ser Trp Ser
25 30 35

GGA AAG GCT GCT GTC AAC GCC CCT GCT TTA ACT TGT GAT AAG AAC GAC 258
Gly Lys Ala Ala Val Asn Ala Pro Ala Leu Thr Cys Asp Lys Asn Asp
40 45 50

AAC CCC ATT TCC AAC ACC AAT GCT GTC AAC GGT TGT GAG GGT GGT GGT 306
Asn Pro Ile Ser Asn Thr Asn Ala Val Asn Gly Cys Glu Gly Gly Gly
55 60 65 70

TCT GCT TAT GCT TGC ACC AAC TAC TCT CCC TGG GCT GTC AAC GAT GAG 354
Ser Ala Tyr Ala Cys Thr Asn Tyr Ser Pro Trp Ala Val Asn Asp Glu
75 80 85

CTT GCC TAC GGT TTC GCT GCT ACC AAG ATC TCC GGT GGC TCC GAG GCC 402
Leu Ala Tyr Gly Phe Ala Ala Thr Lys Ile Ser Gly Gly Ser Glu Ala
90 95 100

AGC TGG TGC TGT GCT TGC TAT GCT TTG ACC TTC ACC ACT GGC CCC GTC 450
Ser Trp Cys Cys Ala Cys Tyr Ala Leu Thr Phe Thr Thr Gly Pro Val
105 110 115

AAG GGC AAG AAG ATG ATC GTC CAG TCC ACC AAC ACT GGA GGT GAT CTC 498
Lys Gly Lys Lys Met Ile Val Gln Ser Thr Asn Thr Gly Gly Asp Leu
120 125 130

GGC GAC AAC CAC TTC GAT CTC ATG ATG CCC GGC GGT GGT GTC GGT ATC 546
Gly Asp Asn His Phe Asp Leu Met Met Pro Gly Gly Gly Val Gly Ile
135 140 145 150

TTC GAC GGC TGC ACC TCT GAG TTC GGC AAG GCT CTC GGC GGT GCC CAG 594
Phe Asp Gly Cys Thr Ser Glu Phe Gly Lys Ala Leu Gly Gly Ala Gln
155 160 165

TAC GGC GGT ATC TCC TCC CGA AGC GAA TGT GAT AGC TAC CCC GAG CTT 642
Tyr Gly Gly Ile Ser Ser Arg Ser Glu Cys Asp Ser Tyr Pro Glu Leu
170 175 180

CTC AAG GAC GGT TGC CAC TGG CGA TTC GAC TGG TTC GAG AAC GCC GAC 690
Leu Lys Asp Gly Cys His Trp Arg Phe Asp Trp Phe Glu Asn Ala Asp
185 190 195

AAC CCT GAC TTC ACC TTT GAG CAG GTT CAG TGC CCC AAG GCT CTC CTC 738
Asn Pro Asp Phe Thr Phe Glu Gln Val Gln Cys Pro Lys Ala Leu Leu
200 205 210

GAC ATC AGT GGA TGC AAG CGT GAT GAC GAC TCC AGC TTC CCT GCC TTC 786
Asp Ile Ser Gly Cys Lys Arg Asp Asp Asp Ser Ser Phe Pro Ala Phe
215 220 225 230

AAG GTT GAT ACC TCG GCC AGC AAG CCC CAG CCC TCC AGC TCC GCT AAG 834
Lys Val Asp Thr Ser Ala Ser Lys Pro Gln Pro Ser Ser Ser Ala Lys
235 240 245

AAG ACC ACC TCC GCT GCT GCT GCC GCT CAG CCC CAG AAG ACC AAG GAT 882
Lys Thr Thr Ser Ala Ala Ala Ala Ala Gln Pro Gln Lys Thr Lys Asp
250 255 260

TCC GCT CCT GTT GTC CAG AAG TCC TCC ACC AAG CCT GCC GCT CAG CCC 930
Ser Ala Pro Val Val Gln Lys Ser Ser Thr Lys Pro Ala Ala Gln Pro
265 270 275

GAG CCT ACT AAG CCC GCC GAC AAG CCC CAG ACC GAC AAG CCT GTC GCC 978
Glu Pro Thr Lys Pro Ala Asp Lys Pro Gln Thr Asp Lys Pro Val Ala
280 285 290

ACC AAG CCT GCT GCT ACC AAG CCC GTC CAA CCT GTC AAC AAG CCC AAG 1026
Thr Lys Pro Ala Ala Thr Lys Pro Val Gln Pro Val Asn Lys Pro Lys
295 300 305 310

ACA ACC CAG AAG GTC CGT GGA ACC AAA ACC CGA GGA AGC TGC CCG GCC 1074
Thr Thr Gln Lys Val Arg Gly Thr Lys Thr Arg Gly Ser Cys Pro Ala
315 320 325

AAG ACT GAC GCT ACC GCC AAG GCC TCC GTT GTC CCT GCT TAT TAC CAG 1122
Lys Thr Asp Ala Thr Ala Lys Ala Ser Val Val Pro Ala Tyr Tyr Gln
330 335 340

TGT GGT GGT TCC AAG TCC GCT TAT CCC AAC GGC AAC CTC GCT TGC GCT 1170
Cys Gly Gly Ser Lys Ser Ala Tyr Pro Asn Gly Asn Leu Ala Cys Ala
345 350 355

ACT GGA AGC AAG TGT GTC AAG CAG AAC GAG TAC TAC TCC CAG TGT GTC 1218
Thr Gly Ser Lys Cys Val Lys Gln Asn Glu Tyr Tyr Ser Gln Cys Val
360 365 370

CCC AAC TAAATGGTAG ATCCATCGGT TGTGGAAGAG ACTATGCGTC TCAGAAGGGA 1274
Pro Asn
375

TCCTCTCATG AGCAGGCTTG TCATTGTATA GCATGGCATC CTGGACCAAG TGTTCGACCC 1334
TTGTTGTACA TAGTATATCT TCATTGTATA TATTTAGACA CATAGATAGC CTCTTGTCAG 1394
CGACAACTGG CTACAAAAGA CTTGGCAGGC TTGTTCAATA TTGACACAGT TTCCTCCATA 1454
AAAAAAAAAA AAAAAAAAA 1473
(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 376 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Arg Ser Tyr Thr Leu Leu Ala Leu Ala Gly Pro Leu Ala Val Ser
1 5 10 15

Ala Ala Ser Gly Ser Gly His Ser Thr Arg Tyr Trp Asp Cys Cys Lys
20 25 30

Pro Ser Cys Ser Trp Ser Gly Lys Ala Ala Val Asn Ala Pro Ala Leu
35 40 45

Thr Cys Asp Lys Asn Asp Asn Pro Ile Ser Asn Thr Asn Ala Val Asn
50 55 60

Gly Cys Glu Gly Gly Gly Ser Ala Tyr Ala Cys Thr Asn Tyr Ser Pro
65 70 75 80

Trp Ala Val Asn Asp Glu Leu Ala Tyr Gly Phe Ala Ala Thr Lys Ile
85 90 95

Ser Gly Gly Ser Glu Ala Ser Trp Cys Cys Ala Cys Tyr Ala Leu Thr
100 105 110

Phe Thr Thr Gly Pro Val Lys Gly Lys Lys Met Ile Val Gln Ser Thr
115 120 125

Asn Thr Gly Gly Asp Leu Gly Asp Asn His Phe Asp Leu Met Met Pro
130 135 140

Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr Ser Glu Phe Gly Lys
145 150 155 160

Ala Leu Gly Gly Ala Gln Tyr Gly Gly Ile Ser Ser Arg Ser Glu Cys
165 170 175

Asp Ser Tyr Pro Glu Leu Leu Lys Asp Gly Cys His Trp Arg Phe Asp
180 185 190

Trp Phe Glu Asn Ala Asp Asn Pro Asp Phe Thr Phe Glu Gln Val Gln
195 200 205

Cys Pro Lys Ala Leu Leu Asp Ile Ser Gly Cys Lys Arg Asp Asp Asp
210 215 220

Ser Ser Phe Pro Ala Phe Lys Val Asp Thr Ser Ala Ser Lys Pro Gln
225 230 235 240

Pro Ser Ser Ser Ala Lys Lys Thr Thr Ser Ala Ala Ala Ala Ala Gln
245 250 255

Pro Gln Lys Thr Lys Asp Ser Ala Pro Val Val Gln Lys Ser Ser Thr
260 265 270

Lys Pro Ala Ala Gln Pro Glu Pro Thr Lys Pro Ala Asp Lys Pro Gln
275 280 285

Thr Asp Lys Pro Val Ala Thr Lys Pro Ala Ala Thr Lys Pro Val Gln
290 295 300

Pro Val Asn Lys Pro Lys Thr Thr Gln Lys Val Arg Gly Thr Lys Thr
305 310 315 320

Arg Gly Ser Cys Pro Ala Lys Thr Asp Ala Thr Ala Lys Ala Ser Val
325 330 335

Val Pro Ala Tyr Tyr Gln Cys Gly Gly Ser Lys Ser Ala Tyr Pro Asn
340 345 350

Gly Asn Leu Ala Cys Ala Thr Gly Ser Lys Cys Val Lys Gln Asn Glu
355 360 365

Tyr Tyr Ser Gln Cys Val Pro Asn
370 375



